Query-Focused Association Rule Mining for Information Retrieval
Abstract
We present a method that applies association rule mining to information retrieval. Our approach differs from traditional information retrieval in that retrieval is based on association rather than similarity, which might be useful for knowledge discovery purposes such as finding an explanation or elaboration for an event in a collection of domain-specific documents. The method proposed in this paper is based on the SmoothApriori algorithm, which accommodates similarity in the association rule mining process to mine association rules between sentences or larger text units. We introduce query-focused association rule mining, which allows association-based retrieval from a larger amount of data than a traditional association rule mining approach. Combined with SmoothApriori, query-focused association rule mining provides association-based retrieval for textual data. This new method was evaluated on the task of automatically restoring sentences that were artificially removed from aviation investigation reports and showed significantly better results than any of our similarity-based retrieval baselines.
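The idea of restricting rule mining to a query can be illustrated with a minimal sketch. It is not the SmoothApriori algorithm described in the abstract: it uses classic exact-match support and confidence over item sets, whereas SmoothApriori additionally accommodates similarity between text units. The function name `query_focused_rules`, the thresholds, and the toy transactions are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def query_focused_rules(transactions, query, min_support=0.2, min_confidence=0.5):
    """Mine rules of the form query -> item (hypothetical helper, not from the paper).

    Returns (item, support, confidence) tuples sorted by descending confidence.
    Only transactions containing the query are scanned for consequents, which is
    why focusing on a query is cheaper than mining all rules.
    """
    n = len(transactions)
    if n == 0:
        return []

    # Keep only the transactions that contain the query item.
    matching = [set(t) for t in transactions if query in t]
    if not matching:
        return []

    # Count how often each other item co-occurs with the query.
    co_counts = Counter()
    for t in matching:
        for item in t:
            if item != query:
                co_counts[item] += 1

    rules = []
    for item, count in co_counts.items():
        support = count / n                 # P(query and item)
        confidence = count / len(matching)  # P(item | query)
        if support >= min_support and confidence >= min_confidence:
            rules.append((item, support, confidence))

    # Highest confidence first; ties broken alphabetically for determinism.
    rules.sort(key=lambda r: (-r[2], r[0]))
    return rules


# Toy data (invented for illustration): each set stands for one text unit.
transactions = [
    {"engine_failure", "fuel_leak", "emergency_landing"},
    {"engine_failure", "fuel_leak"},
    {"engine_failure", "bird_strike"},
    {"icing", "emergency_landing"},
    {"fuel_leak", "maintenance_error"},
]

rules = query_focused_rules(transactions, "engine_failure")
for item, support, confidence in rules:
    print(f"engine_failure -> {item}: support={support:.2f}, confidence={confidence:.2f}")
```

With the default thresholds, only `fuel_leak` survives as a consequent in this toy data, since it co-occurs with the query in two of the three matching transactions. A retrieval system built on this idea would return the associated items rather than the items most similar to the query.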
Similar resources
Relations between Terms Discovered by Association Rules
This paper presents an approach to automatic knowledge base construction used for automatic query expansion. We have designed a system that is able to gain domain knowledge from analysed texts. We have used a Data Mining technique called association rules. We show how this technique is able to discover knowledge about relations between terms. We study how these relations can be used to improve ...
Query Expansion for Document Retrieval by Mining Additional Query Terms
In this paper, we present a new query expansion method for document retrieval by mining additional query terms. The proposed query expansion method uses the vector space model to represent documents and queries. It uses the degrees of importance of relevant terms for finding additional query terms and uses fuzzy rules to infer the weights of the additional query terms. Then, these additional qu...
Mining Term Association Rules for Global Query Expansion: A Case Study with Topic 202 from TREC4
The sudden growth of the World Wide Web and its unprecedented popularity as a de facto global digital library exemplified both the strengths and weaknesses of the Information Retrieval techniques used by popular search engines. Most queries are short and incomplete attempts to describe or characterize the possible documents relevant to the query. It seems then natural to try and expand the quer...
Group Bitmap Index: A Structure for Association Rules Retrieval
Discovery of association rules from large databases of item sets is an important data mining problem. Association rules are usually stored in relational databases for future use in decision support systems. In this paper, the problem of association rules retrieval and item sets retrieval is recognized as the subset search problem in relational databases. The subset search is not well supported ...
Extracting and Evaluating Knowledge from e-Health Documents: A Contribution to Information Retrieval and Indexing
The Internet is a major source of biomedical information. This chapter presents a simple yet efficient approach for extraction of information from biomedical documents available on the Internet. The main objective here is to re-use the information extracted during document retrieval and document indexing. In this work, health information seekers are categorized into three main profiles according...